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A general real-space multigrid algorithm for the self-consistent solution of the Kohn-Sham equa- 
tions appearing in the state-of-the-art electronic-structure calculations is described. The most im- 
portant part of the method is the multigrid solver for the Schrodinger equation. Our choice is the 
Rayleigh quotient multigrid method (RQMG), which applies directly to the minimization of the 
Rayleigh quotient on the finest level. Very coarse correction grids can be used, because there is no 
need to be able to represent the states on the coarse levels. The RQMG method is generalized for 
the simultaneous solution of all the states of the system using a penalty functional to keep the states 
orthogonal. The performance of the scheme is demonstrated by applying it in a few molecular and 
solid-state systems described by non-local norm-conserving pseudopotentials. 



I. INTRODUCTION 

One of the goals of computational materials science 
is to calculate from first principles the various physical 
and chemical properties. This requires the solution of 
the electronic and ionic structures of the materials sys- 
tem in question. The density-functional theory (DFT) 
makes a huge step towards this goal by casting the un- 
tractable problem of many interacting electrons to that 
of noninteracting particles under the influence of an ef- 
fective potentiala. The adiabatic approximation allows 
one to separate the ionic degrees of freedom from those 
of the electrons. However, in order to apply DFT in 
practice one has to resort to approximations for electron 
exchange and correlation such as the local-density ap- 
proximation (LDA) or the generalized-gradient approxi- 
mation (GGA). Moreover, in the case of systems consist- 
ing of hundreds or more atoms it is still a challenge to 
solve numerically efficiently for the ensuing Kohn-Sham 
equations. 

The numerical solution of the Kohn-Sham equations 
is the concern of our present work. It deals with real- 
space (RS) methods, in which the values of the dif- 
ferent functions are presented using three-dimensional 
point grids, and the partial differential equations are dis- 
cretized using finite differencesS'0. The RS methods, as 
suggested by the name chosen, are .contrasted with the 
popular plane-wave (PW) schemescl'El. There are several 
aspects favouring the RS methods over the PW meth- 
ods. Both of the methods are used in the context of 
pseudopotentials describing the electron-ion interactions, 
but only the RS can easily be used in all-electron calcu- 
lations or with hard pseudopotentials of, i.e. first-row 
or transition metal atoms, because the RS grid can be 
refined m a natural way in the ion. core regions (compos- 
ite gridajQ, adaptive coordinatesQO). Systems, such as 
surfaces, containing different length scales are more eco- 
nomically described in the RS than in the PW scheme 
because one needs not waste many grid points in the 
vacuum regions to describe the slowly varying tails of 



wave functions. In the RS methods periodic boundary 
conditions are not necessary. This leads to ease and ac- 
curacy in describing charged atomic clusters in contrast 
to PW methods requiring an artificial neutralizing back- 
ground charge. Besides the above "physical arguments" 
there are also methodological and computational aspects 
favouring the RS methods. The RS methods allow a sys- 
tematic convergence control by increasing the grid (or 
basis function) density. (The PW methods do also so by 
adjusting the cutoff energy of the plane wave expansions.) 
The so-called "order-N" methodsO, the computational 
cost of which scales linearly with the number of elec- 
trons, require localized real-space wave functions leading 
naturally to the employment of RS methodalj. The dis- 
cretizations in the real-space grid can be made local and 
therefore parallelization can effectively use data decom- 
position in which different real-space regions are handled 
with different processing units and the communications 
between processing units will be mainly short-rangecO. 

More specifically, o-uXpChoice for the numerical method 
is a multigrid schemeutx Several approaches employing 
the multigrid idea within electronic. strupt.uj£-palculations 
have appeared during recent years l 16 H 14 n 17 l pH 18 l The main 
idea of multigrid methods is that they avoid the critical 
slowing-down (CSD) phenomenon occuring when a par- 
tial differential equation discretized on a real space grid 
is solved with a simple relaxation method such as the 
Gauss-Seidel method. The discretization operators typi- 
cally use information from a rather localized region of the 
grid at a time. Therefore the high frequency error of the 
length scale of the grid spacing is reduced very rapidly in 
the relaxation. However, once the high frequency error 
has effectively been removed, the very slow convergence 
of the low frequencv-components dominates the overall 
error reduction ratec3, i.e. CSD occurs. In multigrid 
methods one stops the relaxation on a given (fine) grid 
before CSD sets in and transfers the equation to a coarser 
grid (the so-called restriction operation) where the low- 
frequency components can be solved more efficiently. On 
the coarsest grid the problem is solved exactly or as ac- 
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curately as possible, after which one interpolates (the 
so-called prolongation operation) the correction to finer 
grids, performing simultaneously relaxations in order to 
remove the high-frequency errors introduced in the inter- 
polation. 

The solution of the Poisson equation by multigrid 
methods is straightworwardtLj. This is because the error 
(or the correction needed) also obeys a Poisson equation 
and thus will be a smooth function to be presented and 
solved on the repeatedly coarser grids optimal to handle 
the lower frequencies. The solution of an eigenvalue prob- 
lem, such as the Schrodinger equation, is a much more 
complicated task than that of the Poisson equation. The 
problem is no more linear because both the eigenfunc- 
tion and the eigenvalue have to be solved simultaneously. 
Then the error no longer obeys the same equation as 
the solution. Also one has to solve for several eigenpairs 
(eigenvalues and corresponding eigenvectors). Moreover, 
the existence of both negative and positive eigenvalues 
makes the problem indefinite. This implies severe dif- 
ficulties for many simple iterative methods which con- 
verge only in the case of a positive definite iteration ma- 
trix. In particular, it can easily be shown that using 
Gauss-Seidel relaxation for the Schrodinger equation the 
high frequency components typically converge as in the 
case of the Poisson equation, but the low frequency com- 
ponents may diverge, although the divergence may be 
slowEij. More complicated methods such as Kaczmarz re- 
laxation are guaranteed to converge, but may have clearly 
inferior high frequency reduction rates, which are essen- 
tial for the overall speed of multigrid methods. Other 
possible convergent methods include GMRESEj which is 
considerably more complex than Gauss-Seidel relaxation. 

A standard recipe for dealing with eigenproblems 
with multigrids is the full-approximatiofti-storage (FAS) 
method originally described by BrandtEi. In FAS one 
solves for the entire problem on the coarse grids also and 
ends up in solving for a properly modified problem so that 
its solution can be used in correcting the fine grid solu- 
tion. The FAS method may not be very straighworward 
to implement for the Schrodinger equation. It is also 
difficult to present some actual potential on the coarse 
levels accurately enough. However, some succesful appli- 
cations of FAS have appeared in the context of electronic 
structure calculations by Beck et aZ.EUifl and advanced 
strategies for FASiiiave been proposed^. 

Briggs et aZ.ll3lI-3 employ a multigrid method in 
electronic structure calculations by linearizing the 
Schrodinger problem and presenting the potential con- 
tribution on the coarse levels by an error term (residual) 
only. Then on the coarse levels they salve effectively 
for the Poisson problem. Ancilotto et al\t3 modified the 
method by Briggs et al. by shifting to a full multi- 
grid (FMG) scheme and by solving on the coarse grids 
a problem including a local potential term. The idea of 
FMG is to start the smoothing iterations from a coarse 
grid. Then the interpolation to a finer grid provides a 
good initial guess of the solution. The FMG scheme 



can accelerate the convergence remarkably with respect 
to the (above-described) V-cycle scheme in which one 
starts from the finest level. Fattebertu used a multigrid 
method with a block Galerkin inverse iteration (BGII) 
and GMRES in the relaxations. In the method, the 
current approximation is kept orthogonal against all the 
nearby states during the multigrid cycle. The inverse it- 
eration converges for a given guess for the energy eigen- 
value towards the nearest eigenvalue. In order to solve 
all the desired lowest eigenstates a good guess for the 
eigenvalue spectrum is needed in the beginning of iter- 
ations, but thereafter large computational savings may 
be expected because explicit orthogonalizations are not 
needed (at least between well-separated states). 

A severe problem in the existing multigrid schemes for 
the Schrodinger equation is often that the coarse grids 
cannot well approximate the solutions of the coarse grid 
equations themselves. As a consequence the correction 
from coarse grids, no matter how accurately the equation 
is solved, may be ineffective in correcting the fine grid so- 
lution and as a result the overall process converges slowly. 
Therefore one has to restrict to the use of rather fine grids 
only and the convergence speed of the scheme is drasti- 
cally lowered. In those multigrid methods, which use the 
potential also on the coarse grids the size of the .sparsest 
grid has been typically of the order of 31x31x31ErtLZl. Us- 
ing the FAS method coarser grids are possible at least for 
systems with a small number of eigenstates solvedEi If 
a large number of eigenstates have to be solved problems 
may arise because the coarse grids may not be able to 
represent eigenstates with many nodes or the ordering of 
the states may change between the successive grids. To 
bypass these problems in FAS, rather complicated strate- 
gies are neededEI 

In order to avoid the coarse grid representation 
problems we utilize the so-called Rayleigh Quotient 
Multigrid (RQMG) method introduced by Mandel and 
McCormicknil. In this method the coarse grid relaxation 
passes are performed so that the Rayleigh quotient calcu- 
lated on the fine grid will be minimized. In this way there 
is no requirement for the solution to be well represented 
on a coarse grid and the coarse grid representation prob- 
lem is avoided. Mandel and McCormickEil introduced 
the method for the solution of the eigenpair correspond- 
ing to the lowest eigenvalue. We have generalized it to 
the simultaneous solution of a desired number of lowest 
eigenenergy states by developing a scheme which keeps 
the eigenstates separated by the use of a penalty func- 
tional, Gramm-Schmidt orthogonalization, and subspace 
rotations. Our generalization of the RQMG method is an 
attractive alternative for large-scale electronic structure 
calculations. 

The Kohn-Sham equations have to be solved self- 
consistently, i.e. the wave functions solved from the 
single-particle equation determine via the density (so- 
lution of the Poisson equation and the calculation of 
the exchange-correlation potential) the effective poten- 
tial for which they should again be solved. To approach 
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this self-consistency requires an optimized strategy so 
that numerical accuracy of the wave functions and the 
potential increase-4n balance, enabling the most effi- 
cient convergences. In order to avoid the divergence 
of the self-consistency iterations, the mixing of the input 
and output solutions is needed. For this feedback pre-, 
cedure sophisticated schemescS and control strategiesO 
have been presented. 

The outline of the present paper is as follows. In Sec- 
tion II we represent shortly the most important ideas of 
the density functional theory. Section III is devoted for 
numerical methods, the most important of which is the 
Schrodinger equation solver developed. Also the strat- 
egy for the self-consistency iterations is discussed. In 
Section IV we demonstrate by the help of a couple of 
examples the performance of our scheme in calculating 
the electronic structures of small molecules and solid- 
state systems described by pseudopotentials. Section V 
summarises the work and gives outlines for the future 
developments. 



II. THE KOHN-SHAM SCHEME 

In the Kohn-Sham method for electronic structure 
calculations^ one solves for a set of equations self- 
consistentlyty. In the following, we present the equations 
in the spin-compensated form. In practice, we have made 
the straightforward generalization using the spin-density 
functional theory. The set of equations reads as (atomic 
units with h — m e = e = 1 are used): 
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N 



i (r)=^|vl/ 4 (r)| 2 , 



^eff(r) = ^on(r) + Vk(r) + VScc(r), 



^xc(r) 



SExc [n(r) 
Sn(r) 



(1) 

(2) 

(3) 
(4) 

(5) 



The first equation ([!]) is a Schrodinger equation for non- 
interacting particles in an effective potential V c a(r). For 
finite systems the wave functions are required to van- 
ish at the boundaries of the computation volume. In 
the case of infinite periodic systems the complex wave 
functions have to obey the Bloch theorem at the cell 
boundaries. The electron density n(r) is obtained from a 
sum over the N occupied states. The effective potential 



consists of an external potential Vi on (r) due to ions (or 
nuclei in all-electron calculations), the Hartree potential 
Vk(r) calculated from the electron density distribution, 
and the exchange-correlation potential Vxc ( r ) ■ In the ex- 
amples of the present work we use the norm-conserving 
non-local pseudopotentials for the electron-ion interac- 
tions and the local-density approximation (LDA) for the 
exchange-correlation energy 



Exc[n(r)\ = J e X c(rc(r))n(r)dr, 

and for the exchange-correlation potential 

dexc 



Vxc(r) = exc(n(r)) + ra(r)- 



dn |n=n(r) 



(6) 



(7) 



The Hartree potential is solved from the Poisson equa- 
tion 



V 2 y H (r) = -47rra(r). 



(8) 



In practice, the electron density n(r) is substituted by 
the total charge density p(r), which includes the positive 
ionic (nuclear) charge neutralizing the system. In the 
case of finite systems, Dirichlet boundary conditions are 
used with the Coulomb potential values calculated using 
a multipole expansion. For periodic systems we fix the 
average Coulomb potential to zero and allow the peri- 
odic boundary conditions to result in the corresponding 
converged potential. 

The self-consistent solution of the above Kohn-Sham 
equations leads to the ground state electronic structure 
minimizing the total energy 



+ J V ion (r)n(r)dr + E xc + Eh 



Vn(r)n(r)dr 



ion — ion i 



(9) 



where ^ion-ion is the repulsive interaction between the 
ions (nuclei) of the system. Instead of the self-consistency 
iterations the solution of the Kohn-Sham problem can be 
found by minimizing directly the total energy with re- 
spect to the wave function parameters, e.g. plajagj-wave 
coefficients!! However, Kresse and FurthmullemH have 
found this scheme less efficient than the self-consistency 
iterations. 



III. NUMERICAL METHODS 

A. Schrodinger equation solver 

In our real space method we start from an initial guess 
for the effective potential and initial wave functions gen- 
erated by random numbers in grid points. The wave 
functions and the Hartree potential are updated alter- 
natingly towards self-consistency. The solution of the 
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Poisson equation is a standard task for the multigrid 
scheme. If a reasonable guess for the Coulomb poten- 
tial is not available, the FMG method will produce the 
solution starting from random numbers and requiring the 
work which scales linearly as a function of the size of the 
system (O(N)). During the Kohn-Sham iterations one 
can start from the present approximation of Coulomb 
potential and update it with respect to the new charge 
density by performing only a few V-cycles. 

The solution of the wave functions is a much more com- 
plicated task than that of the Poisson equation because 
one has to solve an eigenvalue problem which in the state- 
of-the-art electronic structure calculations means the de- 
termination of several hundreds of eigenpairs. For this 
purpose we have developed a scheme based oh-JIQMG 
method introduced by Mandel and McCormickcil for the 
solution of the eigenpair corresponding to the lowest 
eigenvalue. We begin by reviewing the basic principles 
of RQMG. This is most easily done in the framework of 
the so-called coordinate relaxation method. Thereafter 
we go through the modifications made in order to simul- 
taneously solve for several eigenpairs. 

Coordinate relaxation is a method of solving the dis- 
cretized eigenproblem 



Hu = XBu 



by minimizing the Rayleigh quotient 



{u\H\u) 
{u\B\uY 



(10) 



(11) 



Above, H and B are matrix operators chosen so that the 
Schrodinger equation discretized on a real-space point 
grid with spacing h is satisfied to a chosen order 0(h n ). 
In Eq. (|ll) u is a vector containing the wave function 
values at the grid points. In the relaxation method, the 
current estimate u is replaced by itself plus a multiple of 
some search vector d 



u + ad, 



(12) 



and a is chosen to minimize the Rayleigh quotient. This 
leads to a simple quadratic equation for a. (Find the 
minimum of the expression (|l4|) below with respect to a. 
In the case of a complex wave function one has to solve 
for the real and imaginary parts of a from a coupled pair 
of quadratic equations.) Moreover, the search vector d is 
simply chosen to be unity in one grid point and to van- 
ish in all other points. A complete coordinate relaxation 
pass is then obtained by performing the minimization at 
each point in turn and these passes can be repeated until 
the lowest state is found with desired accuracy. 

Naturally, also the coordinate relaxation suffers from 
CSD because of the use of local information only in up- 
dating u in a certain point. In order to avoid it one 
applies the multigrid idea. In the multigrid scheme by 
Mandel and McCormickcJ the crucial point is that coarse 
grid coordinate relaxation passes are performed so that 



the Rayleigh quotient calculated on the fine grid will be 
minimized. In this way there is no requirement for the 
solution to be well represented on a coarse grid. In prac- 
tice, a coarse grid search substitutes the fine grid solution 
by 



if + al[d c 



(13) 



where the subscripts / and c stand for the fine and coarse 
grids, respectively, and // a prolongation operator in- 
terpolating the coarse grid vector to the fine grid. The 
Rayleigh quotient to be minimized is then 



{uf+al^ d c \Hf \uf-\-al£ d c ) 

Auf+all d a ) 

(uf\H f Uf)+2a{If H f u f \d c )+a 2 {d c \H c d c ) 
(u f \BfUf)+2a(I c ,B j -u } \d a )+a' 2 (d c \B c d c ) ' 



<« / +aI t c d c \B f \u s +aI t c 



(14) 



The second form is obtained by relating the coarse grid 
operators, H c and B Cl with the fine grid ones, Hf and 
Bf, by the Galerkin condition 



H c 

B r 



= IfHfli 
= IfBfll, 



(15) 



and the restriction operator has to be the transpose 
of the prolongation operator 



'/ 



(16) 



The key point to note is that when HfUf and BfUf are 
provided from the fine grid to the coarse grid, the re- 
maining integrals can be calculated on the coarse grid 
itself. Thus one really applies coordinate relaxation on 
the coarse grids to minimize the fine level Rayleigh quo- 
tient. This is a major departure from the earlier meth- 
ods, which to some extent rely on the ability to represent 
the solution of some coarse grid equation on the coarse 
grid itself. Here, on the other hand, one can calculate 
the exact change in the Rayleigh quotient due to any 
coarse grid change, no matter how coarse the grid itself 
is. There is no equation whose solution would have to be 
representable. 

Thus, in the Rayleigh quotient minimization multigrid 
(RQMG) algorithm the coordinate relaxation passes on 
each level keep track of the integrals in Eq. (|l4|). Actu- 
ally, on the finest level we use Gauss-Seidel relaxation, 
which very effectively smoothens the errors of the wave- 
length corresponding to the grid spacing. When calcu- 
lating several eigenpairs Gauss-Seidel relaxation may also 
work as a residual minimization method. The idea is that 
the coarse grid-iterations with Gramm-Schmidt orthogo- 
nalization can provide the separation of the eigenstates so 
well that the subsequent finest level relaxations converge 
to the correct (nearest) eigenstates without orhogonal- 
ization. This requires that the effect of the coarse-level 
smoothing^ on the low- frequency components of the solu- 
tions overcomes the possible divergence tendency of these 
components caused by the Gauss-Seidel relaxation on the 
finest level. 
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Moreover, we discretize the original equation sepa- 
rately on each grid (discretization coarse grid approxi- 
mation (DC A)) instead of using the Galerkin conditions 
of Eq. ( [TBI) This may in principle decrease the conver- 
gence rate and force a limit to the coarsest possible grid 
in order to avoid instability or divergence. However, we 
have observed this DCA implementation of RQMG to be 
quite stable and efficient. To avoid possible coarse- level 
instabilities occuring especially during the first few it- 
eration cycles we may recalculate the Rayleigh quotient 
whenever coarse grid corrections are interpolated to a 
finer grid. Later when approaching the convergence the 
recalculation can be omitted. 

For the matrix operators H and B we have used ei- 
ther high-order (0(h 4 ) oeJaigher) Mehrstellen or central 
difference stencils (CDS)Em The use of high-order sten- 
cils reduces remarkably the density of grid points needed. 
The benefit of the Mehrstellen scheme is that more local 
information is used. The scheme leads to controlled ac- 
curacy and convergence properties and to more isotropic 
smoothing of the error in comparison with the use of 
CDS's. The local nature enables also a more efficient par- 
allel coding. As the prolongation operator l[ we usually 
use trilinear interpolation and as the restriction operator 
7| its transpose, the so-called full-weighting operator, in 
which the coarse-grid values are chosen to be the av- 
eraged values of the surrounding fine grid points. The 
integrations are performed by the trapezoidal rule. 

Next we consider the generalization of the RQMG 
method to the simultaneous solution of several (N) mu- 
tually orthogonal eigenpairs. The separation of the dif- 
ferent states is divided into two or three subtasks. First, 
in order to make the coarse grid relaxations converge to- 
wards the desired state we apply a penalty functional 
scheme. Given the k lowest eigenf unctions, the next low- 
est, (k + l)'th state is searched for by minimizing the 
functional 

(u k+1 \H\u k+1 ) | (Uj\u k+1 ) 2 

(u k+1 \B\u k +i) ^ 1 (ui\ui) ■ (u k+1 \u k+1 ) ' 

The overlap integral in the penalty term is squared to 
make the penalty positive definite. The denominator is 
required to make the functional independent of the norms 
of Ui, % = 1 . . .k+ 1. The minimization of this functional 
is equivalent to imposing the orthonormality constraints 
against the lower k states, when qi oo. By increasing 
the shifts qi any desired accuracy can be obtained, but 
in order to obtain a computationally efficient algorithm 
a reasonable finite value should be used, for example 

q, = (A fc+ i - Ai) + Q, (18) 

where Q is a sufficiently large positive constant. In our 
test calculations Q is of the order of Q = 0.5 ... 2 Ha. 

We minimize the expression (|l7]) simultaneously for 
all N states. This simplifies the algorithm and enables a 



future parallelization over the eigenstates. Thus the cur- 
rent approximations are used for m, i = 1 . . . k. More- 
over, changes in the Ui during a given relaxation sweep 
are not used to update the penalty term in Eq. (|l7|). 
This is sufficient, when the states are always ordered in 
the same way, in the order of increasing eigenvalue. In 
order to reduce computations, the -B-innerproduct is ac- 
tually used in calculating the penalty term integrals be- 
cause the values of Bu are readily available from the finer 
level. The substitution ( |l3| ) is introduced in the func- 
tional p7[ ) and the minimization with respect to a leads 
again to a quadratic equation. This time the coefficients 
contain terms due to the penalty part. 

On the finest level, we do not apply the minimization 
of the penalty functional. The ideal situation would be 
if a residual minimization method, such as the Gauss- 
Seidel method, would keep the states calculated on the 
coarse levels separated. We found out in practical calcu- 
lations that this is not true at least when the states are 
far from convergence. Therefore we have developed for 
the finest level a scheme which by employing Gramm- 
Schmidt orthogonalization and subspace rotation keeps 
the eigenstates orthogonal. The subspace rotation is a 
method to find the most optimally separated eigenvec- 
tors from the approximative ones. The major steps of 
the rotation are: 

(i) Calculation of the Hamiltonian matrix elements be- 
tween the current states: 

H id = { Ui \B- x H\ Uj ). (19) 

(ii) Calculation of the overlap matrix: 

Si j = (ui\uj). (20) 

The use of matrix elements of Eqs. ( |l9| ) and (|2(]) leads to 
eigenvectors orthogonal in the desired Euclidian sense (J- 
orthogonal) and not in the sense of the i?-innerproduct. 

(iii) Diagonalization to find the optimal eigenvectors 
(u' k = J2j A k jUj) and corresponding eigenvalues (X k ): 

HijAj :k — X k 2^ SijAj :k . (21) 

J 3 

In practice, we apply the approximation 

{uAB-^H^) « <t*| Ui >MjM. (22) 

{Ui\±SUj) 

The Gramm-Schmidt orthogonalization and the sub- 
space rotations are organized so that the space of the 
eigenvectors is first divided to small clusters correspond- 
ing to close eigenvalues. The Gramm-Schmidt orthogo- 
nalization is then performed for each cluster at a time 
so that its eigenvectors become orthogonal against the 
eigenvectors of the clusters of lower eigenvalues. Then a 
subspace rotation is performed within the states belong- 
ing to the present cluster. The division to clusters re- 
duces remarkably the cost of the subspace rotation. This 
is because the cost is proportional to 0(N 3 ), where N is 
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the number of states rotated. Moreover, the subspace ro- 
tation requires the calculation of matrix elements which 
are more complicated than those for the simple Gramm- 
Schmidt orthogonalization. 

According to our test calculations this subspace rota- 
tion scheme leads quite effectively to /-orthogonal eigen- 
states. This is seen as a convergence of the eigenvalue 
problem within the numerical accuracy, i.e. the residu- 
als of different eigenstates vanish. In order to prove this 
one has, in practice, to introduce potential shifts which 
reduce the number of significant digits in the eigenvalues 
so that the numerical accuracy of the eigenvalue does 
not prevent to reach the numerical accuracy of the wave- 
function, i.e. the vanishing residual. The error in the 
eigenvalue scales as the square of the residual. When ap- 
plying the subspace rotation it is important to complete 
the highest eigenvalue cluster; otherwise the rotation may 
become inefficient. 

The orthogonalization needed scales as 0(N 3 ). For 
small systems of several tens of eigenpairs this is not yet 
a problem. The algorithm is effective and the number 
of fine grid orthogonalizations remains quite plausible, 
for example, in comparison with the conjugate gradient 
search of eigenpairs employing only the finest gricEa. But 
for larger systems with hundreds of states it will be the 
bottleneck. One solution could be to rely on the finest 
level only on a residual minimization method when the 
initial stages of the iteration process have been performed 
and the solution is clearly on a stable track towards con- 
vergence. 



B. Strategy for self-consistency iterations 

The Kohn-Sham problem has to be solved self- 
consistently. This means that an optimal strategy is 
needed so that computing time is not wasted in the begin- 
ning of the self-consistency iterations to obtain unneces- 
sarily accurate wave functions, because these will change 
during the later iterations due to the changes in the po- 
tential. Updating the potential, including the solution of 
the Poisson equation, is a much less time-consuming task 
than the update of all the wavefunctions. Therefore the 
potential update can be performed frequentlyES. 

The examples of this paper are small-molecule and 
bulk-solid systems described by pseudopotentials. The 
strategy used is schematically presented in Fig. ^. Sim- 
ilar strategies can certainly be applied in other kind of 
Kohn-Sham calculations, for example in those employ- 
ing all-electron or jellium-type models. In the exam- 
ples of this work the initial electron density is the su- 
perposition of the pseudoatom densities centered around 
given nuclear positions. From the superposition we cal- 
culate the initial effective potential, where the wave- 
functions are solved accurately enough using the full- 
multigrid method. The FMG process is started from 
random numbers for the wavefunctions on the coarsest 



level. The accuracy of the wavefunctions is controlled by 
calculating the norms of the residuals of the eigenstates 
and it is finally improved by adding more V-cycles start- 
ing from the finest level. A certain accuracy is needed 
in order to initiate self-consistency iterations which con- 
verge without large density oscillations. Then the new 
electron density and the ensuing effective potential are 
calculated. The new potential is not directly fed into the 
next iteration but it is mixed in this place, as well as 
later between the self-consistency iterations, with the in- 
put potential of the iteration. We monitor the accuracy 
of the wave functions by calculating their residuals and 
require that the accuracy has improved from the previ- 
ous iteration. Usually one V-cycle is sufficient for this, 
because the changes in the potential are small. 



Non self-consistent solution 




FIG. 1. Strategy of self-consistency iterations. First, the 
wavefunctions are solved nonselfconsistently using the full 
multigrid method in the initial potential corresponding to the 
superposition of pseudoatoms. Then the effective potential is 
updated (this is denoted by P in the figure). The potential 
update amounts to calculation of the new electron density, 
the solution of the Poisson equation and calculation of the 
new exchange correlation potential. Next the wave-functions 
are updated by one V-cycle. These two steps are repeated 
until self-consistency has been reached. 

An important point is also to find a proper balance 
with respect to the pre- and postsmoothening sweeps on 
the different grid levels. Typically, on the finest level 
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we made two pre- and postsmoothening sweeps whereas 
on the coarser grids their number is four. Actually, this 
means that on the finest level four successive sweeps are 
done if the potential is not updated. A potential up- 
date is always preceded and followed by two immediate 
smoothcning sweeps. 

IV. TESTS 

We test the performance of our scheme by calculating 
the self-consistent electronic structures of a CO2 molecule 
as well as that of perfect bulk Si lattice with a supercell of 
64 Si atoms. The former system is an example of the em- 
ployment of Dirichlct boundary conditions and the use of 
"hard" pseudopotentials whereas the latter system rep- 
resents the use of periodic boundary conditions and a 
supercell size typical in electronic structure calculations 
for point defects in solids. 

The ions are described, by pseudopotentials of the 
Kleinman-Bylander forn£3, 

^on(r) — ^ ^ Vion ,loc ( | | ) 
a 

+ (frya \ \ AV ion : l( r a)ui m (r a )) 

a,n,lm \ lm/ 

x (AVion,i(r' a ) ) , (23) 

where ^AV^) is a normalization factor, 

)AV lon ,i(r a )u lm (r a )d 3 r, (24) 

and r Q = r — R a , ui m are the atomic pseudopoten- 
tial wave functions of angular and azimuthal momentum 
quantum numbers (l,m), from which the /-dependent 
ionic pseudopotentials T^p«,;(r) are generated using the 
Troullier-Martins schemeo. The ion core is assumed 
to be spherically symmetric. AV ion j(r) = V ion ^(r) - 
Vi ont i oc (r) is the difference between the /-component of 
the ionic pseudopotcntial and the local ionic potential. 
We have chosen the s-component of the pseudopotential 
as the local component. 

Because the functions it; m (r) are short-ranged, oper- 
ating on the wave-function by the nonlocal parts of the 
pseudopotential is in practice a multiplication by a sparse 
matrix. The numerical work required to compute this 
scales as the square of the number of atoms in the sys- 
tem, whereas in the conventional reciprocal-space formu- 
lation the work scales as the cube of the system size. 
The advantage of implementing the nonlocal pseudopo- 
tentials in real space has been noted also in the context 
of plane-wave methodsBj. 

In the previous mul±igrid implementations of the pseu- 
dopotential methodO'Ej, the nonlocal parts have only 
been employed on the finest grid. It is, however, straight- 
forward to implement them also on the coarse levels, and 



we have found that this may increase the convergence 
rate and stability of the method. 

The CO2 molecule is placed diagonically in the center 
of a cubic computation volume of the size of (12.6 ao) 3 . 
Experimental bond lengths are used. Dirichlct boundary 
conditions are used so that the potential values outside 
the cube are obtained from a multipole expansion of the 
charge density. The point mesh used is 63 3 , giving the 
grid spacing h ^.0.20 ao. The Mehrstellen discretization 
by Briggs et a/.O is used. 

In this calculation we used a mixing scheme, where the 
new effective potential is obtained from the input 

and output potentials according to 

^ = (1 + (25) 




5 10 15 

SCF-ITERATION 



FIG. 2. Convergence of the total energy for the 
CO2 -molecule using direct mixing with different values of the 
feedback parameter re; re = 0.4 (solid line), re = 0.5 (dashed 
line), re = 0.6 (dashdotted line) and re = 0.7 (dotted line). A 
horizontal line has been added to indicate the chemical accu- 
racy of 1 meV. 

The convergence of the self-consistency iterations em- 
ploying the strategy described above (Fig. [|) is shown 
in Fig. |^. The deviation of the total energy from the 
converged value is given as a function of self-consistency 
iteration steps performed. The zeroth iteration is a full- 
multigrid solution for the wavefunctions in the initial po- 
tential. Two V-cycles are performed on the finest level 
at this point. The effective potential obtained from the 
output electron density was mixed with the initial po- 
tential using the feedback k = 0.4. Next, at iteration 
one, the wave-functions are relaxed in this new potential 
using one V-cycle. From this point on, the four curves 
in the figure give the convergence with different values 
of the feedback parameter re. One V-cycle per one self- 
consistency iteration step is done. A wide range of val- 
ues for k gives satisfactory convergence indicating a ro- 
bust behaviour for the scheme. The accuracy of 1 meV, 
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which is sufficient in practical calculations, is reached af- 
ter three or four V-cycles. The implementation of the 
non-local parts of the pseudopotential on the coarse lev- 
els is found to speed up the convergence especially in this 
region. From Fig. we obtain an average convergence 
rate of approximately one decade per self-consistency it- 
eration. This igL_of the same order as those reported by 
Wang and Beckta in their FAS scheme or by Kresse and 
Furthmullera in their pseudopotential scheme employ- 
ing self-consistency iterations. The convergence rate of 
one decade per self-consistency iteration is better than 
that obtained by Ancilotto et alx-J in the FMG scheme 
and much better than the rate reached in the linearized 
multigrid scheme by Briggs et alE-i. 



FIG. 3. Valence electron density in the (llO)-plane ob- 
tained in the F-point calculation for the 64-atom supercell 
of bulk Si. The area of the figure corresponds to the extent 
of the supercell. 

We have solved for the electronic structure of perfect 
Si lattice described by a supercell of 64 Si ions. The 
lattice constant of 20.38 ao used is the equilibrium value 
obtained in a plane- wave calculation, with which we have 
compared our real-space results. The first Brillouin zone 
is sampled in this test using the T-point only. The point 
mesh used for the wave-functions is 64 3 , giving the grid 
spacing h = 0.32 ao- For the densities and potentials 
we use a finer grid of 128 3 points. The other numerical 
parameters and the iteration strategy are the same as 
in the CO2 test. The resulting valence electron density 
on the (llO)-plane is given in Fig. |^. The area of the 
figure corresponds to the extent of the supercell. One 
notes that exactly the same features are reproduced at 
the equivalent points in different regions of the supercell. 
This means that a fully converged result has been found. 
We have compared the results of our real-space code to 
those obtained using the plane- wave method. The energy 
cutoff, 18 Ry, of the plane-wave expansion was chosen so 
that it results in a real-space point mesh of 64 3 , i.e. it 
is the same as in our real-space calculation. The widths 
of the valence band and band gaps obtained by the two 
methods agree with an accuracy of 3 meV. In the case 
of degenerate eigenstates the real-space code results in 



degenerate eigenenergies with an accuracy better than 1 
meV. The convergence towards to the self-consistent so- 
lution occurs similarly as for the CO2 molecule in Fig. ||. 
Thus, the convergence process seems to be independent 
of the size of the system. 



V. SUMMARY AND OUTLOOK 

In this work we have generalized the, RQMG method 
introduced by Mandel and McCorrnickc-3 for the simulta- 
neous solution of a desired number of lowest eigenenergy 
states. This approach can be viewed as belonging to a 
third group of multigrid methods, in addition to FAS and 
the techniques where the eigenproblem is linearized (e.g. 
inverse iteration. In principle, one can use arbitrarily 
coarse grids in RQMG, whereas in the other multigrid 
methods one has to be able to represent all the states on 
the coarsest grid. 

We have demonstrated the feasibility of the method 
by electronic structure calculations for the CO2 molecule 
and bulk Si described by pseudopotentials. Our strategy 
for the self-consistent solution consists of a full-multigrid 
solution for the wave-functions in the initial potential, 
and subsequent self-consistency iterations. Less than five 
V-cycles are generally sufficient for practically sufficient 
accuracy. The cpu-times required for the FMG and SCF 
steps are roughly equal. 

We have applied the method also in two-dimensional 
problems for quantum dots employing the current-spin- 
density functional theory, in three-dimensional cylin- 
drically symmetric systems, and also for calculation of 
positron states in solids. 

We believe that our method will eventually compete 
with the standard plane-wave methods for electronic 
structure calculations. However, some straightforward 
programming is still required. For calculations, where 
the optimization of the ionic structure is necessary, the 
Hellmann-Feynman forces will be implemented. In or- 
der to remove the spurious dependence of the total en- 
ergy on the position of the atoms with respect to the 
grid points, Fourier-filtering of the pseudopotentials is re- 
quired. Complex wave functions for any k-point are easily 
implemented, and are already in use in two-dimensional 
geometries. 

Parallelization over k-points can be done easily. One 
only needs to communicate the electron density and ef- 
fective potential at the end of each V-cycle. During 
the RQMG V-cycle, the states are all relaxed simulta- 
neously and independently of each other. Therefore par- 
allelization over states is natural and easy to implement. 
However, for larger systems the Gramm-Schmidt orthog- 
onalization becomes very inefficient in a state-parellel 
code. The most efficient and yet straightforward choice 
is real-space data decomposition, where each processor is 
mapped to a specific region of space. 
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